Descubra por que a Netskope estreou como líder no Quadrante Mágico™ do Gartner® para Single-Vendor SASE

Leia como os clientes inovadores estão navegando com sucesso no cenário atual de mudanças na rede & segurança por meio da plataforma Netskope One.

Baixe o eBook

Saiba mais sobre os parceiros da Netskope

Grupo de diversos jovens profissionais sorrindo

Security Service Edge (SSE), Cloud Access Security Broker (CASB), Cloud Firewall, Next Generation Secure Web Gateway (SWG), and Private Access for ZTNA built natively into a single solution to help every business on its journey to Secure Access Service Edge (SASE) architecture.

Vá para a plataforma

Netskope Next Gen SASE Branch converge o Context-Aware SASE Fabric, Zero-Trust Hybrid Security e SkopeAI-Powered Cloud Orchestrator em uma oferta de nuvem unificada, inaugurando uma experiência de filial totalmente modernizada para empresas sem fronteiras.

Saiba mais sobre Next Gen SASE Branch

Obtenha sua cópia gratuita do único guia de planejamento SASE que você realmente precisará.

Baixe o eBook

Livro eletrônico SASE Architecture For Dummies (Arquitetura SASE para leigos)

Conheça a NewEdge

Rodovia iluminada através de ziguezagues na encosta da montanha

Saiba como protegemos o uso de IA generativa

Ative com segurança o ChatGPT e a IA generativa

Conheça o Zero Trust

Escolha o Netskope GovCloud para acelerar a transformação de sua agência.

Saiba mais sobre o Netskope GovCloud

Previsões para 2025
Neste episódio de Security Visionaries, temos a companhia de Kiersten Todt, presidente da Wondros e ex-chefe de gabinete da Agência de Segurança Cibernética e de Infraestrutura (CISA), para discutir as previsões para 2025 e além.

Reproduzir o podcast Navegue por todos os podcasts

Leia como a Netskope pode viabilizar a jornada Zero Trust e SASE por meio de recursos de borda de serviço de acesso seguro (SASE).

Leia o Blog

Aprenda a navegar pelos últimos avanços em SASE e confiança zero e explore como essas estruturas estão se adaptando para enfrentar os desafios de segurança cibernética e infraestrutura

Explorar sessões

Saiba mais sobre a futura convergência de ferramentas de redes e segurança no modelo predominante e atual de negócios na nuvem.

Saiba mais sobre a SASE

A Netskope tem o orgulho de participar da Visão 2045: uma iniciativa destinada a aumentar a conscientização sobre o papel da indústria privada na sustentabilidade.

Saiba mais

Apoiando a sustentabilidade por meio da segurança de dados

Na Netskope, os fundadores e líderes trabalham lado a lado com seus colegas, até mesmo os especialistas mais renomados deixam seus egos na porta, e as melhores ideias vencem.

Faça parte da equipe

Ir para Soluções para Clientes

Saiba mais sobre Treinamentos e Certificações

Grupo de jovens profissionais trabalhando

Deep Learning for Phishing Website Detection

Request Demo

Introduction

Phishing is one of the most common online security threats. A phishing website tries to mimic a legitimate page in order to obtain sensitive data such as usernames, passwords, or financial and health-related information from potential victims.

Machine learning (ML) algorithms have been used to detect phishing websites, as a complementary approach to signature matching and heuristics. They usually rely on a set of “domain knowledge” features, for example, the number of days the security certificate in the header is valid, the number of domains under the certificate, the host information, etc. However, many of the domain knowledge features are not available for inline processing, and they can be easily circumvented by sophisticated attackers.

To address the shortcomings of the domain knowledge features and detect zero-day phishing attacks in real time, at Netskope we use the latest deep learning techniques to implicitly learn the patterns of phishing websites. This includes using deep learning-based encoders on the textual content of the HTML page, Javascript and CSS code. We have been awarded three U.S. patents (Patent # 11,336,689, 11,438,377 and 11,444,978) for our innovative approach to phishing detection.

HTML Encoder

We have developed an HTML encoder to learn the proper representation of the entire HTML content (including the text body, Javascript, and CSS scripts) associated with the phishing detection use case. The HTML encoder is trained with the transformer-based deep learning architecture. This is inspired by the recent success of state-of-the-art language models, such as BERT and GPT transformer models. Similar to other transformer-based generative pre-training, we use a large number of web pages to train the HTML encoder in an unsupervised fashion. Unlike the BERT and GPT language models, however, the output of the HTML encoder is a two-dimensional ML-generated image. We chose the image output because phishing attacks are designed to use web pages that look similar to the real login pages. The ML-generated images appear to be effective in capturing features relevant to phishing and ignoring irrelevant parts of a web page. Below is an example of an HTML page and the corresponding ML-generated image from the HTML encoder.

The following GIF file shows more examples of the images generated by the HTML encoder. We should keep in mind that our objective is not to generate realistic images from the HTML content. Instead, it is to learn the suitable HTML representation that will be used to train the classification model for phishing detection.

Classification – phishing or not

Once we generate a suitable numerical representation (a vector of numbers) from the HTML content of a web page using the HTML encoder, we then combine it with the embedding of the URL string characters. The resulting numerical values are used as input features and fed into a neural network for final classification. We have collected millions of known phishing web pages and benign pages to train the binary classification model. Since we don’t keep the encoder parameters frozen, the HTML encoder will be fine-tuned toward phishing classification. The trained classifier will determine whether a new web page is phishing or not.

Netskope Threat Protection

The patented phishing website classifier is now part of Netskope Threat Protection, a comprehensive, multi-layered threat protection system powered by AI and machine learning. It enables us to block phishing web pages in real time, because it only needs the page URL string and the HTML content as input, which is readily available in the web traffic that goes through the Netskope secure access service edge (SASE) platform. The phishing classifier has the capability to detect unknown and zero-day phishing attacks, complementing other heuristic and signature-based engines. This classifier has been optimized to scan web pages inline, with an average runtime of less than 10 milliseconds.

To learn more about the multiple layers of threat capabilities that deliver comprehensive threat protection for cloud and web services, please visit Netskope Threat Protection.

The authors would like to acknowledge the significant contributions from Senior Research Scientist Najmeh Miramirkhani on this project.

Yihua Liao

Dr. Yihua Liao is the Head of AI Labs at Netskope. His team develops cutting-edge AI/ML technology to tackle many challenging problems in cloud security.

Mantenha-se informado!

Subscribe for the latest from the Netskope Blog